99 research outputs found

    Evolutionary optimization of neural networks with heterogeneous computation: study and implementation

    In the optimization of artificial neural networks (ANNs) via evolutionary algorithms, and in the implementation of the training needed to evaluate the objective function, there is often a trade-off between efficiency and flexibility. Pure software solutions on general-purpose processors tend to be slow because they do not take advantage of the inherent parallelism, whereas hardware realizations usually rely on optimizations that reduce the range of applicable network topologies, or they attempt to increase processing efficiency by means of low-precision data representation. This paper presents, first, a study that shows the need for a heterogeneous platform (CPU–GPU–FPGA) to accelerate the optimization of ANNs using genetic algorithms and, second, an implementation of a platform based on embedded systems with hardware accelerators implemented in field-programmable gate arrays (FPGAs). The implementation of the individuals on a remote low-cost Altera FPGA allowed us to obtain a 3x–4x acceleration compared with a 2.83 GHz Intel Xeon Quad-Core and 6x–7x compared with a 2.2 GHz AMD Opteron Quad-Core 2354.
    The translation of this paper was funded by the Universitat Politecnica de Valencia, Spain.
    Fe, JD.; Aliaga Varea, RJ.; Gadea Gironés, R. (2015). Evolutionary optimization of neural networks with heterogeneous computation: study and implementation. The Journal of Supercomputing. 71(8):2944-2962. doi:10.1007/s11227-015-1419-7
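    As a rough illustration of the loop this abstract refers to, the sketch below evolves a single topology parameter (the hidden-layer size) with a small genetic algorithm. It is not the authors' implementation: the function names, the population settings and especially the toy `fitness` function are assumptions made for the example. In a real setting the fitness call would train the network, and that evaluation is the costly step the paper offloads to an accelerator.

```python
import random


def fitness(hidden_units: int) -> float:
    """Stand-in objective (assumption): a real run would train a network
    with this topology and return its validation error."""
    return float((hidden_units - 12) ** 2)  # toy error, smallest at 12 units


def evolve(pop_size=20, generations=30, low=1, high=64, seed=0):
    """Evolve a hidden-layer size with truncation selection and elitism."""
    rng = random.Random(seed)
    population = [rng.randint(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluating every individual is the expensive, parallelizable step.
        ranked = sorted(population, key=fitness)
        parents = ranked[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2  # arithmetic crossover
            if rng.random() < 0.3:  # occasional small mutation
                child += rng.choice([-2, -1, 1, 2])
            children.append(min(high, max(low, child)))
        population = parents + children
    return min(population, key=fitness)


if __name__ == "__main__":
    print("best hidden-layer size found:", evolve())
```

    Because the best half of each generation is carried over unchanged, the best error found never gets worse from one generation to the next.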

    Optimization of Deep Neural Networks Using SoCs with OpenCL

    In the optimization of deep neural networks (DNNs) via evolutionary algorithms (EAs) and the implementation of the training necessary for the creation of the objective function, there is often a trade-off between efficiency and flexibility. Pure software solutions implemented on general-purpose processors tend to be slow because they do not take advantage of the inherent parallelism of these devices, whereas hardware realizations based on heterogeneous platforms (combining central processing units (CPUs), graphics processing units (GPUs) and/or field-programmable gate arrays (FPGAs)) are designed based on different solutions using methodologies supported by different languages and using very different implementation criteria. This paper first presents a study that demonstrates the need for a heterogeneous (CPU-GPU-FPGA) platform to accelerate the optimization of artificial neural networks (ANNs) using genetic algorithms. Second, the paper presents implementations of the calculations related to the individuals evaluated in such an algorithm on different (CPU- and FPGA-based) platforms, but with the same source files written in OpenCL. The implementation of individuals on remote, low-cost FPGA systems on a chip (SoCs) is found to enable the achievement of good efficiency in terms of performance per watt.
    This research was funded by Spanish Agency of Research grant number FPA2016-78595-C3-3-R.
    Gadea Gironés, R.; Colom Palero, RJ.; Herrero Bosch, V. (2018). Optimization of Deep Neural Networks Using SoCs with OpenCL. Sensors. 18(5). https://doi.org/10.3390/s18051384

    Improving FPGA Based Impedance Spectroscopy Measurement Equipment by Means of HLS Described Neural Networks to Apply Edge AI

    Applying artificial intelligence (AI) in instruments such as impedance spectroscopy equipment highlights the difficulty of choosing an electronic technology that correctly addresses the basic problems of performance, adaptation to the context, flexibility, precision, autonomy, and speed of design. The present work demonstrates that FPGAs, in conjunction with an optimized high-level synthesis (HLS), allow us to have an efficient connection between the signals sensed by the instrument and the artificial neural network-based AI computing block that will analyze them. State-of-the-art comparisons and experimental results also demonstrate that our designed and developed architectures offer the best compromise between performance, efficiency, and system costs in terms of artificial neural network implementation. In the present work, computational efficiency above 21 Mps/DSP and power efficiency below 1.24 mW/Mps are achieved. These results are all the more relevant because the system can be implemented on a low-cost FPGA.
    This work was supported in part by the Spanish MCIU under Project PID2020-116816RB-I00 (MCIU/FEDER) and in part by GVA under Project INNEST/2020/248.
    Fe, J.; Gadea Gironés, R.; Monzó Ferrer, JM.; Tébar Ruiz, Á.; Colom Palero, RJ. (2022). Improving FPGA Based Impedance Spectroscopy Measurement Equipment by Means of HLS Described Neural Networks to Apply Edge AI. Electronics. 11(13):1-14. https://doi.org/10.3390/electronics11132064

    From specialized to core course in Telecommunications degree: Experiences from digital electronic design and verification

    The European Higher Education Area (EHEA) defines the competences for professional practice of a Telecommunications Engineer. The School of Telecommunication Engineering of the Universitat Politècnica de València (Valencia, Spain) provides an integrated education program consisting of a Graduate (GITST) + Master (MUIT). The GITST course offers four specialization tracks (Electronics, Telematics, Communication Systems and Multimedia) so that future Telecommunications Engineers properly acquire the required knowledge and competences. In 2018, the graduate program implemented a structural change in the organization of subjects to reinforce important skills, in which a course on digital electronics design and verification (Integration of Digital Systems, ISDIGI) was transformed into a core subject of the study plan. In this paper, we describe the methodology and adaptation of ISDIGI (i.e. a project-based learning intermediate HDL course that includes design and verification abilities) to the new GITST Curriculum. In addition, this paper describes the process of moving from specialized to core subject.
    Martínez Millana, A.; Liberos Mascarell, A.; Monzó Ferrer, JM.; Martínez Peiró, MA.; Martínez Pérez, JD.; Gadea Gironés, R. (2020). From specialized to core course in Telecommunications degree: Experiences from digital electronic design and verification. Editorial Universitat Politècnica de València. 229-238. https://doi.org/10.4995/INN2019.2019.10133

    Evaluation of a Modular PET System Architecture with Synchronization over Data Links

    A DAQ architecture for a PET system is presented that focuses on modularity, scalability and reusability. The system defines two basic building blocks: data acquisitors and concentrators, which can be replicated in order to build a complete DAQ of variable size. Acquisition modules contain a scintillating crystal and either a position-sensitive photomultiplier (PSPMT) or an array of silicon photomultipliers (SiPM). The detector signals are processed by AMIC, an integrated analog front-end that generates programmable analog outputs which contain the first few statistical moments of the light distribution in the scintillator. These signals are digitized at 156.25 Msamples/s with free-running ADCs and sent to an FPGA which detects single gamma events, extracts position and time information online using digital algorithms, and submits these data to a concentrator module. Concentrator modules collect single events from acquisition modules and perform coincidence detection and data aggregation. A synchronization scheme over data links is implemented that calibrates each link's latency independently, ensuring that there are no limitations on module mobility, and that the architecture is arbitrarily scalable. Prototype boards with both acquisition and concentration functionality have been built for evaluation purposes. The performance of a small PET system with two detectors based on continuous scintillators is presented. A synchronization error below 50 ps rms is measured, and energy resolutions of 19% and 24% and timing resolutions of 2.0 ns and 4.7 ns FWHM are obtained for PMT and SiPM photodetectors, respectively.
    Manuscript received June 25, 2013; revised November 06, 2013; accepted January 03, 2014. Date of publication January 29, 2014; date of current version February 06, 2014. This work was supported in part by the Spanish Ministry of Science and Innovation under CICYT Grant FIS2010-21216-C02-02.
    Aliaga Varea, RJ.; Herrero Bosch, V.; Monzó Ferrer, JM.; Ros García, A.; Gadea Gironés, R.; Colom Palero, RJ. (2014). Evaluation of a Modular PET System Architecture with Synchronization over Data Links. IEEE Transactions on Nuclear Science. 61(1):88-98. https://doi.org/10.1109/TNS.2014.2298399
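    The coincidence detection performed by the concentrator modules can be pictured with a small software model. The sketch below is only an illustration under stated assumptions, not the paper's FPGA logic: it assumes two detectors, already-synchronized timestamps in nanoseconds sorted in ascending order, and a hypothetical coincidence window chosen by the caller. The function name `find_coincidences` is invented for the example.

```python
def find_coincidences(times_a, times_b, window_ns):
    """Pair single events from two detectors whose timestamps differ by at
    most window_ns. Both lists must be sorted in ascending order."""
    pairs = []
    i = j = 0
    while i < len(times_a) and j < len(times_b):
        dt = times_a[i] - times_b[j]
        if abs(dt) <= window_ns:
            pairs.append((times_a[i], times_b[j]))
            i += 1
            j += 1
        elif dt < 0:
            # The event in A precedes every remaining event in B by more
            # than the window, so it cannot be matched.
            i += 1
        else:
            # Same argument with the roles of A and B swapped.
            j += 1
    return pairs


if __name__ == "__main__":
    detector_a = [10.0, 100.0, 250.0]  # made-up timestamps in ns
    detector_b = [11.5, 180.0, 251.0]
    print(find_coincidences(detector_a, detector_b, window_ns=5.0))
```

    The two-pointer sweep visits each event once, which is why sorted, synchronized timestamps matter: with an unknown offset between boards, true coincidences would fall outside the window.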

    PET System Synchronization and Timing Resolution Using High-Speed Data Links

    Current PET systems with fully digital trigger rely on early digitization of detector signals and the use of digital processors, usually FPGAs, for recognition of valid gamma events on single detectors. Timestamps are assigned and later used for coincidence analysis. In order to maintain a decent timing resolution for events detected on different acquisition boards, it is necessary that local timestamps on different FPGAs be synchronized. Sub-nanosecond accuracy is mandatory if we want this effect to be negligible on overall timing resolution. This is usually achieved by connecting all boards to a common backplane with a precise clock delivery network; however, this approach forces a rigid structure on the whole PET system and may pose scalability problems. © 2006 IEEE.
    Manuscript received June 14, 2010; revised November 18, 2010; accepted March 31, 2011. Date of publication April 21, 2011; date of current version August 17, 2011. This work was supported in part by the Spanish Ministry of Science and Innovation under FPU Grant AP2006-04275 and CICYT Grant FIS2010-21216-C02-02.
    Aliaga Varea, RJ.; Monzó Ferrer, JM.; Spaggiari, M.; Ferrando Jódar, N.; Gadea Gironés, R.; Colom Palero, RJ. (2011). PET System Synchronization and Timing Resolution Using High-Speed Data Links. IEEE Transactions on Nuclear Science. 58(4):1596-1605. https://doi.org/10.1109/TNS.2011.2140130

    Evaluation of a timing integrated circuit architecture for continuous crystal and SiPM based PET systems

    Improving timing resolution in positron emission tomography (PET), and thus having fine time information for the detected pulses, is important to increase the signal-to-noise ratio (SNR) of the reconstructed images [1]. In the present work, an integrated circuit topology for time extraction of the incoming pulses is evaluated. An accurate simulation including the detector physics and the electronics with different configurations has been developed. The selected architecture is intended for a PET system based on a continuous scintillation crystal attached to a SiPM array. The integrated circuit extracts the time stamp from the first few photons generated when the gamma ray interacts with the scintillator, thus obtaining the best time resolution. To get the time stamp from the detected pulses, a time-to-digital converter (TDC) array based architecture has been proposed, as in [2] or [3]. The TDC input stage uses a current comparator to transform the analog signal into a digital signal. Individually configurable trigger levels allow us to avoid false triggers due to signal noise. Using one TDC per SiPM results in a very area-consuming integrated circuit. One solution to this problem is to join several SiPM outputs to one TDC. This reduces the number of TDCs but, on the other hand, makes the first photons more difficult to detect. For this reason, it is important to simulate how the time resolution is degraded when the number of TDCs is reduced. Following this criterion, the best configuration will be selected considering the trade-off between achievable time resolution and the cost per chip. A simulation is presented that uses Geant4 for the physics process and SPICE and Matlab for the electronic blocks. The Geant4 stage simulates the gamma-ray interaction with the scintillator, the photon shower generation and the first stages of the SiPM. The electronics simulation includes an electrical model of the SiPM array and all the integrated circuitry that generates the time stamps. Time resolution results are analyzed using Matlab. The goal is to analyze the best resolution achievable with the SiPM and its degradation due to different circuitry configurations.
    This work was supported by local government Conselleria d’Educacio, Generalitat Valenciana, research program GV/2011/068.
    Monzó Ferrer, JM.; Ros García, A.; Herrero Bosch, V.; Perino Vicentini, IV.; Aliaga Varea, RJ.; Gadea Gironés, R.; Colom Palero, RJ. (2013). Evaluation of a timing integrated circuit architecture for continuous crystal and SiPM based PET systems. Journal of Instrumentation. 8. https://doi.org/10.1088/1748-0221/8/03/C03017
    References:
    Moses, W. W. (2003). Time of flight in PET revisited. IEEE Transactions on Nuclear Science, 50(5), 1325-1330. doi:10.1109/tns.2003.817319
    Fang, X., Ollivier-Henry, N., Gao, W., Hu-Guo, C., Colledani, C., Humbert, B., … Hu, Y. (2011). IMOTEPAD: A mixed-signal 64-channel front-end ASIC for small-animal PET imaging. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 634(1), 106-112. doi:10.1016/j.nima.2011.01.082
    Abidi, M., Koua Calliste, K., Kanoun, M., Panier, S., Arpin, L., Tetraul, M.-A., … Fontaine, R. (2010). A Delay Locked Loop for fine time base generation in a positron emission tomography scanner. 5th International Conference on Design & Technology of Integrated Systems in Nanoscale Era. doi:10.1109/dtis.2010.5487578
    Karp, J. S., Surti, S., Daube-Witherspoon, M. E., & Muehllehner, G. (2008). Benefit of Time-of-Flight in PET: Experimental and Clinical Results. Journal of Nuclear Medicine, 49(3), 462-470. doi:10.2967/jnumed.107.044834
    Monzo, J. M., Aliaga, R. J., Herrero, V., Martinez, J. D., Mateo, F., Sebastia, A., … Pavon, N. (2008). Accurate Simulation Testbench for Nuclear Imaging Systems. IEEE Transactions on Nuclear Science, 55(1), 421-428. doi:10.1109/tns.2007.912878
    Avella, P., De Santo, A., Lohstroh, A., Sajjad, M. T., & Sellin, P. J. (2012). A study of timing properties of Silicon Photomultipliers. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 695, 257-260. doi:10.1016/j.nima.2011.11.049
    Seifert, S., van Dam, H. T., Huizenga, J., Vinke, R., Dendooven, P., Lohner, H., & Schaart, D. R. (2009). Simulation of Silicon Photomultiplier Signals. IEEE Transactions on Nuclear Science, 56(6), 3726-3733. doi:10.1109/tns.2009.2030728
    Corsi, F., Marzocca, C., Perrotta, A., Dragone, A., Foresta, M., Del Guerra, A., … Levi, G. (2006). Electrical Characterization of Silicon Photo-Multiplier Detectors for Optimal Front-End Design. 2006 IEEE Nuclear Science Symposium Conference Record. doi:10.1109/nssmic.2006.35607

    Diseño y primeros resultados de una cámara PET para animales pequeños basada en cristales LYSO continuos y fotomulplicadores sensibles a la posición

    In this paper we present the design of a new small animal PET scanner based on a completely innovative technology. The preliminary results are very promising, showing that high-resolution functional images can be obtained with a compact and low-cost scanner. Several prototypes have been developed and are currently being tested at different medical research institutions. High-resolution images are being obtained in application fields such as oncology, neurology and cardiology. This technology can easily be applied in PET cameras for brain or breast exploration. The most significant innovation of the design is the use of a single continuous crystal per module instead of hundreds of pixellated crystals as in other commercial PETs. This decreases the manufacturing costs while improving performance: 1.2 mm intrinsic position resolution, mean energy resolution of 14%, 3 mm depth of interaction resolution, sensitivity above 4%, and a transaxial field of view 80 mm in diameter. In this paper we describe in detail the design of this new PET camera, its operating principle, the calibration methodology and also some preliminary "in vivo" images of a mouse myocardium and brain, showing the achieved image resolution and quality.
    Benlloch Baviera, JM.; González Martínez, AJ.; Carrilero, V.; Catret, JV.; Correcher, C.; Lerche, CW.; Morera, C.... (2007). Diseño y primeros resultados de una cámara PET para animales pequeños basada en cristales LYSO continuos y fotomulplicadores sensibles a la posición. Revista de física médica. 8(2):315-321. http://hdl.handle.net/10251/79285

    Máquina Moore: Estilo de 2 procesos

    Description of state machines in VHDL. Machine type: Moore. Description style: two processes.
    http://labvirtual.webs.upv.es/FSM_Moore_2.htm
    Gadea Gironés, R. (2011). Máquina Moore: Estilo de 2 procesos. http://hdl.handle.net/10251/1400
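    The two-process idea can be mimicked in software to show how the pieces are separated. The sketch below is a Python behavioural model, not the VHDL from the linked resource: one function plays the role of the combinational next-state logic, another gives the Moore output as a function of the state only, and the loop in `run` stands in for the clocked state register. The example machine (a detector of two consecutive 1s) and all names are assumptions chosen for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    GOT_1 = 1
    GOT_11 = 2


def next_state(state: State, bit: int) -> State:
    """Combinational part: next state from the current state and the input."""
    if bit == 0:
        return State.IDLE
    if state is State.IDLE:
        return State.GOT_1
    return State.GOT_11  # from GOT_1 or GOT_11, another 1 keeps detecting


def output(state: State) -> int:
    """Moore output: depends on the current state only, never on the input."""
    return 1 if state is State.GOT_11 else 0


def run(bits):
    """Sequential part: the state register is updated once per clock tick."""
    state = State.IDLE
    outputs = []
    for bit in bits:
        state = next_state(state, bit)
        outputs.append(output(state))
    return outputs


if __name__ == "__main__":
    print(run([1, 1, 0, 1, 1, 1]))
```

    Keeping `output` free of the input argument is what makes the model a Moore machine; a Mealy variant would take the input bit as a second parameter.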

    FSM estilo de un proceso

    Description of an FSM in Verilog using a single process.
    https://polimedia.upv.es/visor/?id=af9ac480-bef6-11eb-a2dc-25bae68df3bf
    Gadea Gironés, R. (2021). FSM estilo de un proceso. http://hdl.handle.net/10251/167435